A MATRIX TREATMENT OF THE GENERAL PROBLEM OF LEAST SQUARES
CONSIDERING CORRELATED OBSERVATIONS


ABSTRACT 

The most general type of problem considered in least squares is formulated
and solved with the aid of matrix algebra for the case in which the
observations have the general multivariate normal distribution. The
criterion for adjustment is the principle of maximum likelihood. Such
related topics as the inversion of the normal equations, variance-covariance
propagation, direct adjustment of functions of observations, statistical
tests of significance, and the geometrical interpretation of the adjustment
are considered. It is pointed out that the results of the conventional
method of least squares are special cases of the present theory.
1. INTRODUCTION


In recent years several writers have employed matrix algebra to derive
special results for the method of least squares. Dwyer (1944) [1] employed
matrix algebra in considering the least squares determination of regression
coefficients. Along the same line Aitken [illegible] derived matrix results
for [illegible] the minimum variance criterion. Perhaps the most extensive
matrix treatment to date is that of the German geodesist Gotthardt (1952)
[illegible]. [Several lines are illegible in the source; they cite earlier
work, including a publication dated 1872.] [Illegible], however, recognized
that all [illegible] can be treated from a single point of view, namely, as
problems of constrained minima. Thus all least squares adjustments consist
basically of the minimization of a function, the variables of which are
subject to certain [illegible] constraints, the so-called condition
equations. Deming's unique contribution is one of interpretation and
application. For instance, by fully utilizing this [illegible] he gave a
satisfactory solution to the hitherto unsolved general problem of least
squares curve fitting with more than one variable in error.

In this paper we shall employ matrix analysis to extend the Helmert-Deming
general problem of least squares to the case for which the observations have
the general multivariate normal density. However, [illegible] also be
considered from the minimum variance point of view [illegible]. The
conventional method of least squares and the principal results of the above
references emerge as special cases of the present development.



2. STATEMENT OF THE GENERAL PROBLEM

A set of n observations is overdetermined if [illegible]. The degrees of
freedom r of the set is equal to the number of observations in excess of the
minimum number n_0 required to determine the whole set, that is,
r = n - n_0. The number of independent condition equations existing between
the observations is equal to the degrees of freedom of the set. In many
problems, curve fitting for instance, it is convenient to introduce p
unknown parameters into the relations between the observations. The total
number of condition equations existing between the observations and
parameters is then m = r + p. Let this set of m condition equations be
denoted by


(2.1)  f_i(x_1, x_2, \ldots, x_n, a_1, a_2, \ldots, a_p) = 0,   i = 1, 2, \ldots, m,

where a_1, a_2, \ldots, a_p are the unknown parameters and the x_j are adjusted
values of the original observations: that is

(2.2)  x_j = x_j^0 + v_j   (adjusted value = observed value + residual).


Since the condition equations must be independent, it is necessary that the
relations m \geq p, n \geq m - p (or more concisely n + p \geq m \geq p)
hold. If the p parameters are not mutually independent, the relations
existing between them must be included among the condition equations. Hence
s \leq p of the m condition equations may involve parameters only.
Introducing known approximation values a_j^0 for the parameters with

(2.3)  a_j = a_j^0 + \delta_j,

the condition equations can be written

(2.4)  f_i(x_1^0 + v_1, x_2^0 + v_2, \ldots, x_n^0 + v_n, a_1^0 + \delta_1, a_2^0 + \delta_2, \ldots, a_p^0 + \delta_p) = 0,   i = 1, 2, \ldots, m.


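Assuming the v's and \delta's to be sufficiently small, (2.4) can be
approximated by the zero and first order terms of its Taylor expansion. The
linearized condition equations are thus

(2.5)  \sum_{j=1}^{n} f_{i x_j} v_j + \sum_{j=1}^{p} f_{i a_j} \delta_j + f_{i0} = 0,   i = 1, 2, \ldots, m,

in which

(2.6)  f_{i x_j} = \partial f_{i0} / \partial x_j^0,

(2.7)  f_{i a_j} = \partial f_{i0} / \partial a_j^0,

(2.8)  f_{i0} = f_i(x_1^0, x_2^0, \ldots, x_n^0, a_1^0, a_2^0, \ldots, a_p^0).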


Let \sigma_{ii} denote the variance of the observation x_i^0. Then the
general problem of least squares as considered by Helmert and Deming is to
determine the set of residuals and parameter corrections which minimizes the
sum

(2.9)  S = \sum_{i=1}^{n} (\sigma_{00} / \sigma_{ii}) v_i^2

while satisfying the condition equations (2.4) or equivalently (2.5). The
quantity \sigma_{00} / \sigma_{ii} is the weight of the ith observation,
with \sigma_{00} an arbitrary constant termed the unit variance or variance
of unit weight. If the observational errors have the normal distribution,
the residuals so obtained are the most probable values, and the least
squares and maximum likelihood adjustments are equivalent. Suppose, however,
that the errors have the general multivariate normal distribution.
Temporarily considering the residuals as the actual errors, the distribution
is

(2.10)  h(v_1, v_2, \ldots, v_n) = (1 / 2\pi)^{n/2} |\sigma^{-1}|^{1/2} \exp(-\tfrac{1}{2} V^T \sigma^{-1} V),

where \sigma is the covariance matrix of the observations, |\sigma^{-1}| the
determinant of \sigma^{-1}, and V the vector

(2.11)  V^T = (v_1 \ v_2 \ \cdots \ v_n),

the superscript T denoting transposition. For this case the most probable
set of residuals is clearly that which minimizes the quadratic form

(2.12)  S = V^T \sigma^{-1} V,


or a multiple thereof, while satisfying the specified condition equations.
The classical method of least squares corresponds to the special case for
which \sigma is diagonal, i.e., the observations are mutually independent
and are normally distributed. Henceforth we shall use the terminology
'method of least squares' in a broad sense to denote the maximum likelihood
adjustment of observations having the general multivariate normal
distribution.
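
The relation between the quadratic form (2.12) and the classical weighted sum of squares can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the residuals and covariance values are made up for illustration, the unit variance is taken as 1, and the helper name `quadratic_form` is hypothetical.

```python
import numpy as np

def quadratic_form(v, sigma):
    """Return S = v^T sigma^{-1} v without forming the inverse explicitly."""
    v = np.asarray(v, dtype=float)
    return float(v @ np.linalg.solve(sigma, v))

# Illustrative residuals (not taken from the paper)
v = np.array([0.3, -0.1, 0.2])

# Uncorrelated case: sigma is diagonal, so S is a weighted sum of squares
sigma_diag = np.diag([0.04, 0.01, 0.09])
s_matrix = quadratic_form(v, sigma_diag)
s_weighted = float(np.sum(v**2 / np.diag(sigma_diag)))

# Correlated case: the off-diagonal terms change the value of S
sigma_corr = np.array([[0.04, 0.01, 0.00],
                       [0.01, 0.01, 0.00],
                       [0.00, 0.00, 0.09]])
s_corr = quadratic_form(v, sigma_corr)

print(s_matrix, s_weighted, s_corr)
```

For the diagonal matrix the two computations agree; once a covariance between the first two observations is introduced, the same residuals give a different value of S.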

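The set of linearized condition equations (2.5) can be expressed in matrix
notation by

(2.13)  F_{X_0} V + F_{A_0} \Delta + F_0 = 0,

where in addition to V defined above we have

(2.14)  F_{X_0} = \begin{pmatrix} f_{1x_1} & f_{1x_2} & \cdots & f_{1x_n} \\ f_{2x_1} & f_{2x_2} & \cdots & f_{2x_n} \\ \vdots & & & \vdots \\ f_{mx_1} & f_{mx_2} & \cdots & f_{mx_n} \end{pmatrix},
        F_{A_0} = \begin{pmatrix} f_{1a_1} & f_{1a_2} & \cdots & f_{1a_p} \\ f_{2a_1} & f_{2a_2} & \cdots & f_{2a_p} \\ \vdots & & & \vdots \\ f_{ma_1} & f_{ma_2} & \cdots & f_{ma_p} \end{pmatrix},
        \Delta = (\delta_1 \ \delta_2 \ \cdots \ \delta_p)^T,
        F_0 = (f_{10} \ f_{20} \ \cdots \ f_{m0})^T.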

The problem to be considered is to determine, of all possible vectors V and
\Delta which satisfy (2.13), those which result in the minimization of
(2.12). References [2] and [3] treat basically the case in which F_{X_0} is
a square, diagonal matrix with \sigma nondiagonal (correlated observations).
In [1], [4], and [8], F_{X_0} is also square and diagonal, but the
observations are uncorrelated (\sigma diagonal). In addition references [4]
and [8] also consider the case in which several observations appear in each
of the condition equations but no parameters are involved (F_{X_0}
rectangular and filled and F_{A_0} nonexistent).


3.  THE  NORMAL  EQUATIONS 

Problems  in  constrained  minima  are  most  conveniently  solved  by  the  method 
of  Lagrange  multipliers,  also  called  the  method  of  correlates.  Accordingly  let 

(3.1)  \Lambda^T = (\lambda_1 \ \lambda_2 \ \cdots \ \lambda_m)

be a vector of m undetermined constant multipliers, one for each equation of
constraint (condition equation). We must then minimize the expression

(3.2)  S = V^T \sigma^{-1} V - 2 \Lambda^T (F_{X_0} V + F_{A_0} \Delta + F_0).

Differentiating this with respect to the free variables V and \Delta gives

(3.3)  dS = 2 (V^T \sigma^{-1} - \Lambda^T F_{X_0}) dV - 2 \Lambda^T F_{A_0} d\Delta.

For S to be a minimum, dS must equal zero for all possible variations dV and
d\Delta. Hence

(3.4)  V^T \sigma^{-1} - \Lambda^T F_{X_0} = 0,

(3.5)  F_{A_0}^T \Lambda = 0.


Solving (3.4) for V gives

(3.6)  V = \sigma F_{X_0}^T \Lambda,

and since V must satisfy (2.13), we have

(3.7)  (F_{X_0} \sigma F_{X_0}^T) \Lambda + F_{A_0} \Delta + F_0 = 0.


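The vectors \Lambda and \Delta can be obtained by solving (3.5) and (3.7)
simultaneously, and then V can be determined from (3.6). Combining (3.5) and
(3.7) into the single matrix equation

(3.8)  \begin{pmatrix} F_{X_0} \sigma F_{X_0}^T & F_{A_0} \\ F_{A_0}^T & 0 \end{pmatrix} \begin{pmatrix} \Lambda \\ \Delta \end{pmatrix} + \begin{pmatrix} F_0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}

gives the general normal equations. [The remainder of this sentence is
largely illegible in the source; it states that these results are the matrix
equivalents of results obtained by earlier authors.]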
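
The solution of (3.8) followed by (3.6) can be sketched numerically. The example below is an illustration under stated assumptions, not the paper's own computation: it assumes NumPy, uses a made-up straight-line fit y_i = a_1 + a_2 t_i with correlated observations y_i, and the function name `solve_normal_equations` is hypothetical. Because the condition equations are linear here, a single solution is exact.

```python
import numpy as np

def solve_normal_equations(F_x, F_a, F_0, sigma):
    """Solve (3.8) for the multipliers and corrections, then apply (3.6)."""
    m, p = F_a.shape
    N = np.block([[F_x @ sigma @ F_x.T, F_a],
                  [F_a.T, np.zeros((p, p))]])
    rhs = -np.concatenate([F_0, np.zeros(p)])
    sol = np.linalg.solve(N, rhs)
    lam, delta = sol[:m], sol[m:]
    v = sigma @ F_x.T @ lam          # residuals from (3.6)
    return lam, delta, v

# Illustrative data (assumed): abscissas t are exact, ordinates y are observed
t = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.1, 0.9, 2.2, 2.9])
# Correlated covariance matrix: neighbouring observations have correlation 0.5
sigma = 0.01 * (np.eye(4) + 0.5 * (np.eye(4, k=1) + np.eye(4, k=-1)))
a0 = np.array([0.0, 1.0])            # approximate parameter values

# Condition equations f_i = y_i - a_1 - a_2 t_i = 0 and their derivatives
F_x = np.eye(4)
F_a = -np.column_stack([np.ones(4), t])
F_0 = y - (a0[0] + a0[1] * t)

lam, delta, v = solve_normal_equations(F_x, F_a, F_0, sigma)
a_hat = a0 + delta
print(a_hat, v)
```

For this special case the adjusted parameters coincide with the familiar generalized least squares estimate, which gives an independent check on the block solution.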


4. INVERSION OF NORMAL EQUATION COEFFICIENT MATRIX


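If any of the condition equations involve parameters only, the matrix
F_{X_0} \sigma F_{X_0}^T will be singular. To allow for this possibility in
the inversion of the normal equations we assume that the last s \leq p of
the m condition equations involve the parameters only. F_{X_0} will then
have the form

(4.1)  F_{X_0} = \begin{pmatrix} F^{1}_{X_0} \\ 0 \end{pmatrix},

where the last s rows are composed of zeros. The broken line indicates the
partitioning. Partitioning F_{A_0}, \Lambda and F_0 correspondingly, we have

(4.2)  F_{A_0} = \begin{pmatrix} F^{1}_{A_0} \\ F^{2}_{A_0} \end{pmatrix},

(4.3)  \Lambda = \begin{pmatrix} \Lambda^{1} \\ \Lambda^{2} \end{pmatrix},   F_0 = \begin{pmatrix} F_0^{1} \\ F_0^{2} \end{pmatrix},

in which the upper submatrices contain the first m - s rows and the lower
submatrices the last s rows.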


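From (4.1) it follows that

(4.4)  F_{X_0} \sigma F_{X_0}^T = \begin{pmatrix} F^{1}_{X_0} \sigma F^{1T}_{X_0} & 0 \\ 0 & 0 \end{pmatrix},

and this substituted into the general normal equations along with (4.2) and
(4.3) gives

(4.5)  \begin{pmatrix} F^{1}_{X_0} \sigma F^{1T}_{X_0} & 0 & F^{1}_{A_0} \\ 0 & 0 & F^{2}_{A_0} \\ F^{1T}_{A_0} & F^{2T}_{A_0} & 0 \end{pmatrix} \begin{pmatrix} \Lambda^{1} \\ \Lambda^{2} \\ \Delta \end{pmatrix} + \begin{pmatrix} F_0^{1} \\ F_0^{2} \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},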

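or in a more convenient arrangement

(4.6)  \begin{pmatrix} F^{1}_{X_0} \sigma F^{1T}_{X_0} & F^{1}_{A_0} & 0 \\ F^{1T}_{A_0} & 0 & F^{2T}_{A_0} \\ 0 & F^{2}_{A_0} & 0 \end{pmatrix} \begin{pmatrix} \Lambda^{1} \\ \Delta \\ \Lambda^{2} \end{pmatrix} + \begin{pmatrix} F_0^{1} \\ 0 \\ F_0^{2} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.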

This arrangement of the normal equations is generally suitable for
elimination solutions. If such a solution should break down at any point,
i.e., division by zero occurs, it is merely necessary to delete the row and
column where the difficulty occurs and remove them to the end of the matrix,
taking care to rearrange the corresponding unknowns also. Such a breakdown
would happen, for instance, if [illegible] were singular, or for that
matter, if any square submatrix containing an entire left hand section of
the diagonal were singular.

[Partly illegible in the source.] The general expression for the inverse of
the coefficient matrix N of the normal equations (4.6) is


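(4.7)  N^{-1} = \begin{pmatrix} G^{-1} - H^T Q H & H^T Q & -H^T K^T L^{-1} \\ Q H & -Q & K^T L^{-1} \\ -L^{-1} K H & L^{-1} K & L^{-1} \end{pmatrix},

in which

(4.8)   D = F^{1}_{X_0} \sigma,

(4.9)   G = D F^{1T}_{X_0},

(4.10)  H = F^{1T}_{A_0} G^{-1},

(4.11)  J = H F^{1}_{A_0},

(4.12)  K = F^{2}_{A_0} J^{-1},

(4.13)  L = K F^{2T}_{A_0},

(4.14)  Q = J^{-1} - K^T L^{-1} K.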


From (4.6) and (4.7) the roots of the normal equations are

(4.15)  \Lambda^{1} = -(G^{-1} - H^T Q H) F_0^{1} + (H^T K^T L^{-1}) F_0^{2},

(4.16)  \Delta = -(Q H) F_0^{1} - (K^T L^{-1}) F_0^{2},

(4.17)  \Lambda^{2} = (L^{-1} K H) F_0^{1} - (L^{-1}) F_0^{2}.

[This sentence is largely illegible in the source; it concerns the matrices
G and J and the assured independence of the condition equations.]

Some relations among the auxiliary matrices (4.9) - (4.14) are

(4.18)  H G H^T = J,

(4.19)  K J K^T = L,

(4.20)  Q J Q = Q.

[Illegible in the source; the sentence evidently relates the matrix Q to
the] parameters. Hence for the general case considered in which the
parameters are not necessarily independent, it is to be expected that Q be
singular and of rank p - s.
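
The partitioned inverse (4.7) and the relations (4.18) - (4.20) can be checked numerically. The sketch below assumes NumPy and fills the submatrices with arbitrary random numbers purely for illustration (5 observations, 3 parameters, 4 condition equations involving observations and 1 involving parameters only); none of these values come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, s, m1 = 5, 3, 1, 4           # m1 = m - s equations involving observations

F1x = rng.normal(size=(m1, n))     # plays the role of F^1_{X_0}
F1a = rng.normal(size=(m1, p))     # plays the role of F^1_{A_0}
F2a = rng.normal(size=(s, p))      # F^2_{A_0}: relations among parameters only
B = rng.normal(size=(n, n))
sigma = B @ B.T + np.eye(n)        # an arbitrary positive definite covariance

inv = np.linalg.inv
D = F1x @ sigma                    # (4.8)
G = D @ F1x.T                      # (4.9)
H = F1a.T @ inv(G)                 # (4.10)
J = H @ F1a                        # (4.11)
K = F2a @ inv(J)                   # (4.12)
L = K @ F2a.T                      # (4.13)
Q = inv(J) - K.T @ inv(L) @ K      # (4.14)

# Coefficient matrix of the normal equations in the arrangement (4.6)
N = np.block([[G, F1a, np.zeros((m1, s))],
              [F1a.T, np.zeros((p, p)), F2a.T],
              [np.zeros((s, m1)), F2a, np.zeros((s, s))]])

# Inverse assembled block by block from (4.7)
N_inv = np.block([[inv(G) - H.T @ Q @ H, H.T @ Q, -H.T @ K.T @ inv(L)],
                  [Q @ H, -Q, K.T @ inv(L)],
                  [-inv(L) @ K @ H, inv(L) @ K, inv(L)]])

print(np.abs(N @ N_inv - np.eye(m1 + p + s)).max())
print(np.linalg.matrix_rank(Q, tol=1e-8))
```

The first printed number should be at rounding level, and the rank of Q should come out as p - s, in line with the remark above that Q is singular when the parameters are not independent.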


5.  COVARIANCE  MATRICES  RELATED  TO  THE  ADJUSTMENT 
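Let Y and Z be arbitrary vectors of variates with

(5.1)  Y^T = (y_1 \ y_2 \ \cdots \ y_k),

(5.2)  Z^T = (z_1 \ z_2 \ \cdots \ z_l).

Then clearly

(5.3)  Y Z^T = \begin{pmatrix} y_1 z_1 & y_1 z_2 & \cdots & y_1 z_l \\ y_2 z_1 & y_2 z_2 & \cdots & y_2 z_l \\ \vdots & & & \vdots \\ y_k z_1 & y_k z_2 & \cdots & y_k z_l \end{pmatrix},

and we define

(5.4)  \sigma_{YZ^T} = \begin{pmatrix} \sigma_{y_1 z_1} & \sigma_{y_1 z_2} & \cdots & \sigma_{y_1 z_l} \\ \sigma_{y_2 z_1} & \sigma_{y_2 z_2} & \cdots & \sigma_{y_2 z_l} \\ \vdots & & & \vdots \\ \sigma_{y_k z_1} & \sigma_{y_k z_2} & \cdots & \sigma_{y_k z_l} \end{pmatrix}

as the covariance matrix of the vectors Y, Z. It readily follows that

(5.5)  \sigma_{YZ^T} = (\sigma_{ZY^T})^T.

If Y = Z we shall simply refer to (5.4) as the covariance matrix of Y.

Now let

(5.6)  u_i^0 = u_i(x_1^0, x_2^0, \ldots, x_n^0),   i = 1, 2, \ldots, q,

be arbitrary functions of the observations x_j^0. Differentiation gives

(5.7)  du_i^0 = \sum_{j=1}^{n} u_{ij} \, dx_j^0,

where u_{ij} is the partial derivative of u_i^0 with respect to x_j^0.
Regarding du_i^0 as the error in u_i^0 resulting from errors dx_j^0 in the
observations, the covariance of du_g^0, du_h^0 and hence that of u_g^0,
u_h^0 is

(5.8)  \sigma_{u_g u_h} = \sum_{i=1}^{n} \sum_{j=1}^{n} u_{gi} \, \sigma_{ij} \, u_{hj},

where \sigma_{ij} denotes the covariance of x_i^0, x_j^0. Letting

(5.9)  U_0 = (u_1^0 \ u_2^0 \ \cdots \ u_q^0)^T,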

we can express (5.7) as

(5.10)  dU_0 = (\partial U_0^T / \partial X_0)^T dX_0 = U_{X_0} dX_0,

and it follows readily from (5.8) that

(5.11)  \sigma_{U_0 U_0^T} = U_{X_0} \sigma_{X_0 X_0^T} U_{X_0}^T = U_{X_0} \sigma U_{X_0}^T

is the covariance matrix of U_0. By means of this result the covariance
matrix of any vector of functions of the observations can be determined.

Of particular interest is the covariance matrix of the unknowns in the
normal equations. Letting

(5.12)  W_0^T = (\Lambda^{1T} \ \Delta^T \ \Lambda^{2T}),

(5.13)  C_0^T = (F_0^{1T} \ 0 \ F_0^{2T}),

the solution of the normal equations can be written

(5.14)  W_0 = -N^{-1} C_0.


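Since N is essentially unaffected by the observational errors and hence may
be considered constant, we have from (5.14)

(5.15)  dW_0 = -N^{-1} dC_0,

where

(5.16)  dC_0 = (\partial C_0^T / \partial X_0)^T dX_0 = (\partial F_0^{1T} / \partial X_0 \quad 0 \quad \partial F_0^{2T} / \partial X_0)^T dX_0.

But by definition

(5.17)  \partial F_0^{1T} / \partial X_0 = F^{1T}_{X_0},

and since F_0^{2} involves parameters only,

(5.18)  \partial F_0^{2T} / \partial X_0 = 0.

Hence in (5.16)

(5.19)  dC_0 = (F^{1T}_{X_0} \ 0 \ 0)^T dX_0.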
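From (4.7), (5.15) and (5.19) it follows that

(5.20)  dW_0 = -\begin{pmatrix} G^{-1} - H^T Q H \\ Q H \\ -L^{-1} K H \end{pmatrix} F^{1}_{X_0} dX_0 = W_{X_0} dX_0.

Applying (5.10) and (5.11) yields

(5.21)  \sigma_{W_0 W_0^T} = \begin{pmatrix} G^{-1} - H^T Q H \\ Q H \\ -L^{-1} K H \end{pmatrix} F^{1}_{X_0} \sigma F^{1T}_{X_0} \begin{pmatrix} G^{-1} - H^T Q H \\ Q H \\ -L^{-1} K H \end{pmatrix}^T,

which reduces to

(5.22)  \sigma_{W_0 W_0^T} = \begin{pmatrix} G^{-1} - H^T Q H & 0 & -H^T K^T L^{-1} \\ 0 & Q & 0 \\ -L^{-1} K H & 0 & L^{-1} \end{pmatrix} = \begin{pmatrix} \sigma_{\Lambda^1 \Lambda^{1T}} & \sigma_{\Lambda^1 \Delta^T} & \sigma_{\Lambda^1 \Lambda^{2T}} \\ \sigma_{\Delta \Lambda^{1T}} & \sigma_{\Delta \Delta^T} & \sigma_{\Delta \Lambda^{2T}} \\ \sigma_{\Lambda^2 \Lambda^{1T}} & \sigma_{\Lambda^2 \Delta^T} & \sigma_{\Lambda^2 \Lambda^{2T}} \end{pmatrix}.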

Comparing this result with (4.7) shows that the covariance matrix of the
unknowns can be obtained directly from the inverse of the normal equation
coefficient matrix. Notice also that the covariance matrices of the vectors
\Lambda^{1}, \Delta and \Lambda^{2}, \Delta are zero. Since by (3.6) V is a
function of \Lambda alone, it follows that the residuals and parameter
corrections are mutually independent.

To derive the covariance matrix of the residuals we use (3.6), (4.1), (4.3),
(4.8) and (4.15) to write

(5.23)  V = \sigma F_{X_0}^T \Lambda = D^T \Lambda^{1} = -D^T ((G^{-1} - H^T Q H) F_0^{1} - (L^{-1} K H)^T F_0^{2}).

Hence by (5.17) and (5.18)

(5.24)  dV = -D^T (G^{-1} - H^T Q H) F^{1}_{X_0} dX_0 = V_{X_0} dX_0,

which leads to

(5.25)  \sigma_{VV^T} = V_{X_0} \sigma V_{X_0}^T = D^T (G^{-1} - H^T Q H) D = D^T \sigma_{\Lambda^1 \Lambda^{1T}} D.

From this and (5.24) we see that

(5.26)  V_{X_0} \sigma = \sigma V_{X_0}^T = -V_{X_0} \sigma V_{X_0}^T = -\sigma_{VV^T}.


We shall next determine the covariance matrix of the adjusted observations.
This matrix is of primary importance, since comparing it with the covariance
matrix of the original observations enables one to gauge the improvement
effected by the adjustment. By (2.2) we may write

(5.27)  X = X_0 + V,

where X denotes the vector of adjusted observations. Differentiation gives

(5.28)  dX = (I + V_{X_0}) dX_0,

and using (5.26) the covariance matrix of X turns out to be

(5.29)  \sigma_{XX^T} = \sigma - \sigma_{VV^T}.

This  shows  that  the  covariance  matrix  of  the  adjusted  observations  is  equal  to 
the  covariance  matrix  of  the  original  observations  minus  that  of  the  residuals. 

After the adjustment is completed it is often required that functions of the
adjusted observations, or more generally of the adjusted observations and
parameters, be evaluated and that their variances be determined. The latter
can be achieved through a reinterpretation of (5.6) - (5.11). Let

(5.30)  u_i = u_i(x_1, x_2, \ldots, x_n, a_1, a_2, \ldots, a_p),   i = 1, 2, \ldots, t,

be arbitrary functions of the adjusted observations and parameters and let U
be the vector of the u_i. Then with obvious notation

(5.31)  dU = U_X dX + U_A dA = (U_X \ U_A) \begin{pmatrix} dX \\ dA \end{pmatrix},

and since by (2.3) dA = d\Delta, we have

(5.32)  \sigma_{UU^T} = (U_X \ U_A) \begin{pmatrix} \sigma_{XX^T} & \sigma_{XA^T} \\ \sigma_{AX^T} & \sigma_{AA^T} \end{pmatrix} (U_X \ U_A)^T.


The covariance matrix of the vector (X^T \ A^T)^T is readily found to be

(5.33)  \begin{pmatrix} \sigma_{XX^T} & \sigma_{XA^T} \\ \sigma_{AX^T} & \sigma_{AA^T} \end{pmatrix} = \begin{pmatrix} \sigma - \sigma_{VV^T} & -D^T H^T Q \\ -Q H D & Q \end{pmatrix},

and this result along with (5.32) may be used to determine the covariance
matrix of any vector of functions of the adjusted observations and
parameters.
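
As an illustration of (5.25), (5.29) and the propagation rule (5.11), consider the simplest possible adjustment: the three measured angles of a plane triangle, whose sum must equal 180 degrees. This is a sketch under stated assumptions: NumPy is assumed, the numbers are made up, the angles are expressed as deviations in seconds of arc from nominal values that sum to 180 degrees, and no parameters are present, so the term H^T Q H in (5.25) drops out.

```python
import numpy as np

# Observed deviations (seconds of arc) and an assumed correlated covariance matrix
x0 = np.array([3.0, -1.0, 4.0])
sigma = np.array([[4.0, 1.0, 0.0],
                  [1.0, 4.0, 1.0],
                  [0.0, 1.0, 4.0]])

# Single condition equation: the deviations must sum to zero
F_x = np.ones((1, 3))
F_0 = np.array([x0.sum()])

D = F_x @ sigma                          # (4.8)
G = D @ F_x.T                            # (4.9)
lam = -np.linalg.solve(G, F_0)           # normal equations without parameters
v = D.T @ lam                            # residuals, (5.23) without parameters
x = x0 + v                               # adjusted observations, (5.27)

sigma_vv = D.T @ np.linalg.solve(G, D)   # (5.25) with the H^T Q H term absent
sigma_xx = sigma - sigma_vv              # (5.29)

# Propagation (5.11) to the function u = x_1 + x_2 of the adjusted angles
U_x = np.array([[1.0, 1.0, 0.0]])
sigma_uu = U_x @ sigma_xx @ U_x.T

print(x, np.diag(sigma_xx), sigma_uu[0, 0])
```

Two features are worth noting. The adjusted sum has zero variance, since it is fixed by the condition equation, and the variance of x_1 + x_2 equals that of the adjusted x_3, as it must when the two quantities differ only by a constant.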


6.  DIRECT  ADJUSTMENT  OF  FUNCTIONS  OF  OBSERVATIONS 

In section 2 it was assumed that the x_i^0 were the original observations.
We now relax this requirement by considering the x_i^0 as independent
functions of (perhaps) more elemental observations \bar{x}_j^0 which have a
known multinormal distribution. Accordingly we write

(6.1)  x_i^0 = x_i(\bar{x}_1^0, \bar{x}_2^0, \ldots, \bar{x}_k^0),   i = 1, 2, \ldots, n.

Differentiating this gives

(6.2)  dx_i^0 = \sum_{j=1}^{k} \phi_{ij} \, d\bar{x}_j^0,   \phi_{ij} = \partial x_i^0 / \partial \bar{x}_j^0,

in which dx_i^0 may be regarded as the error in the derived observation
x_i^0 resulting from errors d\bar{x}_j^0 in the elemental observations
\bar{x}_j^0. Writing (6.2) in terms of residuals rather than differentials
gives
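(6.3)  v_i = \sum_{j=1}^{k} \phi_{ij} \bar{v}_j,

or in matrix notation

(6.4)  V = \Phi \bar{V}.

[Partly illegible in the source.] The adjusted, derived observations must
satisfy equation (2.1). However, since the distribution of the derived
observations is not known, we must use the distribution of the elemental
observations in the adjustment. From (2.13) and (6.4) it follows that the
linearized condition equations in terms of the elemental residuals are

(6.5)  F_{X_0} \Phi \bar{V} + F_{A_0} \Delta + F_0 = 0,

and the most probable elemental residuals are obtained by minimizing,
subject to the constraint (6.5),

(6.6)  \bar{S} = \bar{V}^T \bar{\sigma}^{-1} \bar{V},

in which \bar{\sigma} is the covariance matrix of the elemental
observations.

Following the procedure of section 3 with regard to (6.5) and (6.6) we
arrive at the following results. The most probable elemental residuals are

(6.7)  \bar{V} = \bar{\sigma} \Phi^T F_{X_0}^T \Lambda,

and \Lambda and \Delta are obtained from the normal equations

(6.8)  \begin{pmatrix} F_{X_0} \Phi \bar{\sigma} \Phi^T F_{X_0}^T & F_{A_0} \\ F_{A_0}^T & 0 \end{pmatrix} \begin{pmatrix} \Lambda \\ \Delta \end{pmatrix} + \begin{pmatrix} F_0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.

From (6.4) and (6.7) the derived residuals are

(6.9)  V = \Phi \bar{\sigma} \Phi^T F_{X_0}^T \Lambda,

and since the covariance matrix of the derived observations is

(6.10)  \sigma = \Phi \bar{\sigma} \Phi^T,

equations (6.8) and (6.9) can be written

(6.11)  \begin{pmatrix} F_{X_0} \sigma F_{X_0}^T & F_{A_0} \\ F_{A_0}^T & 0 \end{pmatrix} \begin{pmatrix} \Lambda \\ \Delta \end{pmatrix} + \begin{pmatrix} F_0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix},

(6.12)  V = \sigma F_{X_0}^T \Lambda.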


But these results are identical respectively with (3.8) and (3.6), which
were obtained by minimizing V^T \sigma^{-1} V subject to the same condition
equations. Hence it is possible to adjust derived observations directly and
without modification by the procedures developed earlier. From this we might
infer that the derived observations also have a multinormal density with
covariance matrix given by (6.10). A direct proof of this is given in the
next section.




If the derived residuals are obtained from (6.11) and (6.12), the problem
remains of determining the elemental residuals. This can be accomplished by
multiplying (6.9) by \bar{\sigma} \Phi^T \sigma^{-1}, which gives, according
to (6.7),

(6.13)  \bar{V} = \bar{\sigma} \Phi^T (\Phi \bar{\sigma} \Phi^T)^{-1} V.

The interesting feature of this result is that it is the same thing as the
least squares adjustment of the elemental observations subject to condition
equations given by (6.4) in which the vector of derived residuals is assumed
to be known. Hence once the derived residuals have been obtained, the
elemental residuals can be determined from a new least squares adjustment in
which the condition equations are the relations between the elemental and
derived observations.

The results of this section free us of the sometimes cumbersome restriction
of having to adjust the elemental observations directly. Moreover, by
judicious formulation of a given problem and choice of derived observations
efficient solutions can often be developed. Sometimes, for instance, it is
possible to calculate a single covariance matrix for the derived
observations, which can be used for a series of different, successive
adjustments. In other cases efficient approximation solutions can be
developed. For example, if the diagonal terms of \sigma strongly
predominate, it may be possible to ignore the nondiagonal terms completely
without altering the final results significantly.

An application of this principle in the field of photogrammetric
triangulation is given by Brown [13].
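
The equivalence described in this section can be illustrated with a small numerical sketch. The example is hypothetical and assumes NumPy: three elemental readings are combined into two derived differences, a single condition equation fixes the sum of the differences, and no parameters are involved. The derived observations are adjusted with the covariance matrix (6.10), the elemental residuals are then recovered with (6.13), and the result is compared with a direct adjustment of the elemental observations by (6.7) and (6.8).

```python
import numpy as np

solve = np.linalg.solve

# Hypothetical elemental observations (k = 3) and their covariance matrix
x_bar0 = np.array([10.02, 12.51, 15.03])
sigma_bar = 1e-4 * np.array([[2.0, 1.0, 0.0],
                             [1.0, 2.0, 1.0],
                             [0.0, 1.0, 2.0]])

# Derived observations (n = 2): successive differences, x = Phi @ x_bar
Phi = np.array([[-1.0, 1.0, 0.0],
                [0.0, -1.0, 1.0]])
x0 = Phi @ x_bar0
sigma = Phi @ sigma_bar @ Phi.T                # (6.10)

# Single condition equation on the derived observations: x_1 + x_2 - 5.00 = 0
F_x = np.array([[1.0, 1.0]])
F_0 = np.array([x0.sum() - 5.00])

# Adjust the derived observations directly: (6.11), (6.12) without parameters
lam = -solve(F_x @ sigma @ F_x.T, F_0)
v = sigma @ F_x.T @ lam

# Recover the elemental residuals from the derived ones: (6.13)
v_bar_from_v = sigma_bar @ Phi.T @ solve(sigma, v)

# Reference: adjust the elemental observations directly: (6.7), (6.8)
lam_bar = -solve(F_x @ Phi @ sigma_bar @ Phi.T @ F_x.T, F_0)
v_bar_direct = sigma_bar @ Phi.T @ F_x.T @ lam_bar

print(v, v_bar_from_v, v_bar_direct)
```

Both routes give the same elemental residuals, and applying Phi to them reproduces the derived residuals, as (6.4) requires.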


7. JOINT DISTRIBUTION OF INDEPENDENT LINEAR COMBINATIONS OF VARIATES FROM A
MULTINORMAL DENSITY

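For present purposes we shall consider \bar{V} and V as vectors of the
actual errors in the elemental and derived observations respectively. By
hypothesis the distribution of the elemental errors is

(7.1)  h(\bar{v}_1, \bar{v}_2, \ldots, \bar{v}_k) = (1 / 2\pi)^{k/2} |\bar{\sigma}^{-1}|^{1/2} \exp(-\tfrac{1}{2} \bar{V}^T \bar{\sigma}^{-1} \bar{V}).

We shall show that the joint, marginal distribution of the derived errors,
V = \Phi \bar{V}, is a multinormal density with covariance matrix given by
(6.10). Equation (6.4) may be rewritten as

(7.2)  V = \Phi_1 \bar{V}_1 + \Phi_2 \bar{V}_2,

in which \Phi_1 is the square matrix defined by the partition

(7.3)  \Phi = (\Phi_1 \mid \Phi_2) = \left( \begin{array}{cccc|ccc} \phi_{11} & \phi_{12} & \cdots & \phi_{1n} & \phi_{1,n+1} & \cdots & \phi_{1k} \\ \phi_{21} & \phi_{22} & \cdots & \phi_{2n} & \phi_{2,n+1} & \cdots & \phi_{2k} \\ \vdots & & & \vdots & \vdots & & \vdots \\ \phi_{n1} & \phi_{n2} & \cdots & \phi_{nn} & \phi_{n,n+1} & \cdots & \phi_{nk} \end{array} \right)

and

(7.4)  \bar{V}^T = (\bar{V}_1^T \mid \bar{V}_2^T) = (\bar{v}_1 \ \bar{v}_2 \ \cdots \ \bar{v}_n \mid \bar{v}_{n+1} \ \cdots \ \bar{v}_k).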

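Since the v's are independent, we may assume that the variates have been so
ordered that \Phi_1 is nonsingular. Solving (7.2) for \bar{V}_1 in terms of
the n derived errors and remaining k - n elemental errors gives

(7.5)  \bar{V}_1 = \Phi_1^{-1} (V - \Phi_2 \bar{V}_2),

and according to (7.4) this allows us to write

(7.6)  \bar{V} = \begin{pmatrix} \Phi_1^{-1} & -\Phi_1^{-1} \Phi_2 \\ 0 & I \end{pmatrix} \begin{pmatrix} V \\ \bar{V}_2 \end{pmatrix}.

Letting

(7.7)  \partial / \partial V = (\partial / \partial v_1 \ \ \partial / \partial v_2 \ \cdots \ \partial / \partial v_n)^T,

the Jacobian of the transformation (7.5) is

(7.8)  |\partial \bar{V}_1^T / \partial V| = |\Phi_1^{-1}|,

and the joint distribution of the v's and the last k - n \bar{v}'s is thus

(7.9)  g(v_1, \ldots, v_n, \bar{v}_{n+1}, \ldots, \bar{v}_k) = h(\bar{v}_1, \bar{v}_2, \ldots, \bar{v}_k) \, |\Phi_1^{-1}|,

which according to (7.6) may be written

(7.10)  g = (1 / 2\pi)^{k/2} |\Omega^{-1}|^{1/2} \exp(-\tfrac{1}{2} (V^T \ \bar{V}_2^T) \, \Omega^{-1} \, (V^T \ \bar{V}_2^T)^T).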

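This is a multinormal density in which

(7.11)  \Omega^{-1} = \begin{pmatrix} \Phi_1^{-1} & -\Phi_1^{-1} \Phi_2 \\ 0 & I \end{pmatrix}^T \bar{\sigma}^{-1} \begin{pmatrix} \Phi_1^{-1} & -\Phi_1^{-1} \Phi_2 \\ 0 & I \end{pmatrix}

and \Omega is the covariance matrix of the vector (V^T \ \bar{V}_2^T)^T. The
middle matrix \bar{\sigma}^{-1} is partitioned to be conformable with the
submatrices of its factors.

The marginal distribution of the v's in (7.10) can be obtained by
integrating out the last k - n variates. As is known (Mood [illegible]),
this will lead to a new multinormal distribution, the covariance matrix of
which can be obtained by striking out the last k - n rows and columns of
\Omega. To obtain \Omega from \Omega^{-1} we note that

(7.12)  \begin{pmatrix} \Phi_1^{-1} & -\Phi_1^{-1} \Phi_2 \\ 0 & I \end{pmatrix}^{-1} = \begin{pmatrix} \Phi_1 & \Phi_2 \\ 0 & I \end{pmatrix}

and invert (7.11) applying the reversal rule. The reduced result is

(7.13)  \Omega = \begin{pmatrix} \Phi_1 \bar{\sigma}_{11} \Phi_1^T + \Phi_1 \bar{\sigma}_{12} \Phi_2^T + \Phi_2 \bar{\sigma}_{21} \Phi_1^T + \Phi_2 \bar{\sigma}_{22} \Phi_2^T & \Phi_1 \bar{\sigma}_{12} + \Phi_2 \bar{\sigma}_{22} \\ \bar{\sigma}_{21} \Phi_1^T + \bar{\sigma}_{22} \Phi_2^T & \bar{\sigma}_{22} \end{pmatrix},

which by the partitioning of \Phi and \bar{\sigma} can be written more
compactly as

(7.14)  \Omega = \begin{pmatrix} \Phi \bar{\sigma} \Phi^T & \Phi \bar{\sigma}_2 \\ \bar{\sigma}_2^T \Phi^T & \bar{\sigma}_{22} \end{pmatrix},

where \bar{\sigma}_2 is defined by

(7.15)  \bar{\sigma}_2 = \begin{pmatrix} \bar{\sigma}_{12} \\ \bar{\sigma}_{22} \end{pmatrix}.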


Hence by (7.14) the covariance matrix of the v's in the marginal
distribution is

(7.16)  \sigma = \Phi \bar{\sigma} \Phi^T,

which agrees with (6.10). The marginal distribution of the derived errors is
thus

(7.17)  h(v_1, v_2, \ldots, v_n) = (1 / 2\pi)^{n/2} |\sigma^{-1}|^{1/2} \exp(-\tfrac{1}{2} V^T \sigma^{-1} V),

which is the result we set out to establish. Since the covariance matrix of
the set of variates is unique, this result could have been established
directly from (6.10) and the fact stated above that a marginal distribution
from a multinormal distribution is also a multinormal distribution.
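
The matrix identities used in this section can be verified numerically. The sketch below assumes NumPy and uses made-up values for \Phi and \bar{\sigma} with k = 3 elemental and n = 2 derived errors; it builds \Omega^{-1} from the substitution (7.6), inverts it, and compares the leading block with \Phi \bar{\sigma} \Phi^T.

```python
import numpy as np

inv = np.linalg.inv

# Hypothetical example: k = 3 elemental errors, n = 2 derived errors
Phi = np.array([[1.0, 2.0, 0.5],
                [0.0, 1.0, 1.0]])
n, k = Phi.shape
Phi1, Phi2 = Phi[:, :n], Phi[:, n:]       # partition (7.3); Phi1 is nonsingular

sigma_bar = np.array([[2.0, 0.5, 0.0],
                      [0.5, 1.0, 0.3],
                      [0.0, 0.3, 1.5]])   # covariance of the elemental errors

# Matrix of the substitution (7.6): V_bar = T_inv @ (V, V_bar_2)
T_inv = np.block([[inv(Phi1), -inv(Phi1) @ Phi2],
                  [np.zeros((k - n, n)), np.eye(k - n)]])

Omega_inv = T_inv.T @ inv(sigma_bar) @ T_inv
Omega = inv(Omega_inv)

marginal = Omega[:n, :n]                  # strike out the last k - n rows and columns
direct = Phi @ sigma_bar @ Phi.T          # (7.16)

print(np.abs(marginal - direct).max())
```

The printed difference should be at rounding level, and the off-diagonal block of Omega agrees with \Phi \bar{\sigma}_2, where \bar{\sigma}_2 consists of the last k - n columns of \bar{\sigma}.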


8.  EVALUATION  OF  THE  QUADRATIC  FORM  OF  THE  RESIDUALS 


While S can always be computed directly from the definition (2.12), this is
not always convenient, especially if \sigma is nondiagonal and difficult to
invert or if the vector of residuals is not otherwise required. It is
possible to derive alternate expressions for S which involve neither V nor
\sigma^{-1} and which are essentially byproducts of the solution of the
normal equations. Thus, starting with the definition (2.12) and employing
(5.23) and (4.9) yields

(8.1)  S = V^T \sigma^{-1} V = (\Lambda^{1T} F^{1}_{X_0} \sigma) \, \sigma^{-1} (\sigma F^{1T}_{X_0} \Lambda^{1}) = \Lambda^{1T} G \Lambda^{1}.


A simpler form results from this if we successively employ the relations

(8.2)  G \Lambda^{1} = -(F_0^{1} + F^{1}_{A_0} \Delta),   \Lambda^{1T} F^{1}_{A_0} = -\Lambda^{2T} F^{2}_{A_0},   F^{2}_{A_0} \Delta = -F_0^{2},

which follow directly from the set of normal equations (4.6). The reduced
result is

(8.3)  S = -\Lambda^{1T} F_0^{1} - \Lambda^{2T} F_0^{2} = -\Lambda^T F_0,

which corresponds to the expression given by Deming [11] (equation 17,
p. 57) for the general uncorrelated case. This result provides a convenient
starting point for the derivation of still other expressions for S which may
be useful in special cases. For the general problem, however, it is doubtful
whether a simpler or more convenient formula than (8.3) exists.
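
The agreement between the definition (2.12) and the shortcut (8.3) can be checked on a small linear problem. The following sketch assumes NumPy and uses made-up data: a straight line y = a_1 + a_2 t fitted to four correlated observations, with one additional condition equation a_1 + a_2 - 1 = 0 that involves the parameters only, so that both terms of (8.3) are exercised.

```python
import numpy as np

# Illustrative data (assumed, not from the paper)
t = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.1, 0.9, 2.2, 2.9])
sigma = 0.01 * (np.eye(4) + 0.5 * (np.eye(4, k=1) + np.eye(4, k=-1)))
a0 = np.array([0.2, 0.9])                 # approximate parameter values

# Four equations y_i + v_i - a_1 - a_2 t_i = 0, then a_1 + a_2 - 1 = 0
F_x = np.vstack([np.eye(4), np.zeros((1, 4))])
F_a = np.vstack([-np.column_stack([np.ones(4), t]), [[1.0, 1.0]]])
F_0 = np.concatenate([y - (a0[0] + a0[1] * t), [a0.sum() - 1.0]])

# General normal equations (3.8)
m, p = F_a.shape
N = np.block([[F_x @ sigma @ F_x.T, F_a],
              [F_a.T, np.zeros((p, p))]])
sol = np.linalg.solve(N, -np.concatenate([F_0, np.zeros(p)]))
lam, delta = sol[:m], sol[m:]
v = sigma @ F_x.T @ lam                   # residuals, (3.6)

S_direct = float(v @ np.linalg.solve(sigma, v))   # definition (2.12)
S_short = float(-lam @ F_0)                       # shortcut (8.3)

print(S_direct, S_short)
```

The shortcut needs only the multipliers and the constant terms, which are already at hand once the normal equations have been solved.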

When the original condition equations have been linearized by Taylor's
series, a sequence of iterations of the solution may be necessary to remove
the influence of neglected higher order terms. In this case the final value
of the vector V is the sum of the initial V and those obtained from
successive iterations. The same is true of the vectors \Lambda, \Delta and
F_0. Equation (8.3) is not strictly valid when \Lambda and F_0 result from
iterations. It should, however, provide an accurate approximation in most
cases. An exact expression for S when N iterations have been performed may
be derived from the second equality of (8.1). Let (\Lambda^{1})_i and
(F^{1}_{X_0})_i be the values of \Lambda^{1} and F^{1}_{X_0} corresponding
to the ith iteration, with the values for the initial solution corresponding
to i = 0. Then

(8.4)  S = B^T \sigma B,

where B is the vector

(8.5)  B^T = \sum_{i=0}^{N} (\Lambda^{1T})_i (F^{1}_{X_0})_i.

The principal merit of this result is that it does not require the inversion
of \sigma. If this is not a serious problem, it may be preferable to employ
the first equality of (8.1), using the final vector of residuals.


9.  TESTS  OF  SIGNIFICANCE  FOR  THE  ADJUSTMENT 

Deming (1935) [13] has shown that the quadratic form of the residuals
obtained from the least squares adjustment of uncorrelated, normally
distributed observations has a \chi^2 distribution with r = n - n_0 degrees
of freedom (see first paragraph of section 2 of this report). By means of a
transformation of the residuals it can be shown that this result holds for
the correlated case as well. This fact may be used to provide a test of
significance for the adjustment. Essentially, the test determines whether
the estimate of unit variance obtained from the adjustment is compatible
with the pre-established value. We set

(9.1)  \chi_1^2 = S

and determine the probability P(\chi^2 > \chi_1^2; r degrees of freedom)
from a table of the \chi^2 distribution. If this probability is unreasonably
small (or large, though this would rarely occur in practice), a poor
adjustment is indicated and an effort should be made to determine and, if
possible, correct the cause. Among the principal reasons for an
unsatisfactory \chi^2 result are:

(a)  Computational errors: Though mention of this possibility may seem
trivial, it is felt that the correctness of the computations should be
established before seeking other explanations. This is especially true
if the adjustment is routine and has been consistently successful before,
or if P(\chi^2 > \chi_1^2) turns out to be so extreme that the other
possibilities to be mentioned seem unlikely.

(b)  Uncorrected systematic errors in the observations: An investigation
into all phases of the measuring operation is necessary to evaluate this
possibility. Special instrumental calibrations provide the means for
correcting such 'errors'. It should be pointed out that a satisfactory
\chi^2 result does not, in itself, preclude the existence of systematic
errors, especially if they are constant or nearly so.

(c)  Inadequate or incorrect condition equations: The condition equations
represent a mathematical model of a physical situation and as such are
satisfactory only if they actually approximate the true situation to a
degree compatible with the accuracy of the measurements. In many cases,
as measuring accuracy increases, more complex models become necessary in
order to account for previously insignificant factors. If a model is
inadequate, it may well show up in the \chi^2 test. However, since
inadequate models can result from systematic errors, the remarks of (b)
hold here also.

An adequate set of condition equations may lead to an inadequate set of
linearized condition equations due to the influence of neglected higher
order terms. The residuals or parameter corrections may be so large that
ordinary iterations actually cause the solution to diverge or to converge
to an incorrect result. When such difficulties result from poor
approximation parameters alone, the method of 'damped least squares'
developed by Levenberg (1944) [17] may be useful (this is discussed somewhat
in the next section), while if the residuals also are too large, a gradient
method of minimization described by Curry (1944) [18] may lead to a
satisfactory solution.

Aside from inadequate condition equations, which may be regarded as
poor approximations, incorrect equations, which do not approximate the
physical situation at all, may be included among the set of condition
equations. Incorrect condition equations result from outright mistakes
or from faulty analysis and can be expected to affect the \chi^2 result
adversely.

In  complex  measuring  situations  the  most  difficult  problem  may  not  lie 
in  the  actual  adjustment,  for  this  can  be  done  straightforwardly  by  the 
methods  of  this  paper,  nor  in  obtaining  correct  condition  equations,  but 
rather in the determination of the degrees of freedom of a set of observations
and hence of the number of condition equations. We may speak of an
incomplete or overcomplete set of condition equations according to whether
fewer or more than the correct number are chosen. Assuming the individual
condition equations to be correct, an overcomplete set will lead to a
singular set of normal equations. An incomplete set, on the other hand,
may yield a solution and may or may not result in a poor $\chi^2$ test,
depending upon the importance of the omitted condition equations.
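
The singularity produced by an overcomplete set can be illustrated with a small numerical sketch, which is not part of the original report. It assumes condition equations that involve the residuals alone, written in the hypothetical form A V + W = 0, so that the matrix of the normal equations is taken to be A sigma A^T. The names `A_correct`, `A_over`, `sigma` and the helper functions are illustrative, and the numbers are invented.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def det(M):
    """Determinant by cofactor expansion (adequate for tiny matrices)."""
    if len(M) == 1:
        return M[0][0]
    total = 0.0
    for j, pivot in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * pivot * det(minor)
    return total

def normal_matrix(A, sigma):
    """Assumed form of the normal-equation matrix: A sigma A^T."""
    return matmul(matmul(A, sigma), transpose(A))

# Invented covariance matrix of four correlated observations.
sigma = [[4.0, 1.0, 0.0, 0.0],
         [1.0, 3.0, 1.0, 0.0],
         [0.0, 1.0, 3.0, 1.0],
         [0.0, 0.0, 1.0, 5.0]]

# Two independent condition equations (the assumed correct number).
A_correct = [[1.0, 1.0, 0.0, 0.0],
             [0.0, 0.0, 1.0, 1.0]]

# Overcomplete set: the third equation is the sum of the first two, so it
# is individually correct but adds no new information.
A_over = A_correct + [[1.0, 1.0, 1.0, 1.0]]

det_correct = det(normal_matrix(A_correct, sigma))  # nonzero: solvable
det_over = det(normal_matrix(A_over, sigma))        # zero: singular
```

In this sketch the redundant equation makes the rows of the coefficient matrix linearly dependent, which is what forces the determinant of the normal matrix to vanish.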

(d) Inaccurate covariance matrix of the observations: In order to make the
$\chi^2$ test it is necessary to assume that $\sigma$ is accurately known. Actually,
in practice, an estimate of $\sigma$, generally derived from replicated observations,
is used. In order to have confidence in such an estimate the
degrees of freedom upon which the estimates of the elements of $\sigma$ are
based should be reasonably large, say greater than 20. In many problems,
the elements of the covariance matrix of the observations may be known
precisely to a constant multiple, $\sigma_{00}$, the variance of unit weight: that
is, $\sigma = \sigma_{00}\sigma_0$ where $\sigma_0$ is known precisely. In this case only $\sigma_{00}$ need be
estimated, and an alternate test of significance described below may be
used to compare the least squares estimate of unit variance with a given
prior estimate.

From the least squares adjustment under consideration we may obtain an
estimate $(\sigma_{00})_1$ of $\sigma_{00}$ with (note that if $\sigma_{00}$ is unknown,
$S = V^T \sigma_0^{-1} V = \sigma_{00} V^T \sigma^{-1} V$)

$$(\sigma_{00})_1 = \frac{S}{r_1} = \frac{\chi_1^2\,\sigma_{00}}{r_1}. \tag{9.2}$$

Hence

$$\frac{\chi_1^2}{r_1} = \frac{(\sigma_{00})_1}{\sigma_{00}}, \tag{9.3}$$


in which, for the sake of uniformity, we have used $r_1$ rather than $r$ to denote
the degrees of freedom. Now let $(\sigma_{00})_0$ be an estimate of $\sigma_{00}$ obtained from a
least squares adjustment independent of that from which $(\sigma_{00})_1$ was obtained.
(Note that the usual, straightforward method of computing the variance of a set
of repeated observations from the sum of squares of deviations from the mean is
essentially an estimate based upon a least squares adjustment.) Letting the
degrees of freedom associated with such an independent estimate be $r_0$, we may
write


$$\frac{\chi_0^2}{r_0} = \frac{(\sigma_{00})_0}{\sigma_{00}}. \tag{9.4}$$


We  then  form  the  ratio  for  the  F distribution 


$$F_0 = \frac{\chi_1^2 / r_1}{\chi_0^2 / r_0} = \frac{(\sigma_{00})_1}{(\sigma_{00})_0}. \tag{9.5}$$


The value $P(F \ge F_0 \mid r_1, r_0 \text{ degrees of freedom})$ provides a test of the
compatibility of the two estimates of unit variance and should be used in place of
the $\chi^2$ test when the unit variance is not accurately known beforehand. When
$r_0$ is large the $\chi^2$ and $F$ tests lead to similar results.
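
The comparison of two estimates of unit variance by the F ratio can be sketched numerically as follows. This is an illustrative sketch rather than part of the original report: the function names (`f_upper_tail`, `unit_variance_f_test`, `_betainc`, `_betacf`) and the numbers in the example are assumptions. The tail probability is evaluated through the regularized incomplete beta function, using a standard continued-fraction expansion, so that only the Python standard library is needed.

```python
import math

def _betacf(a, b, x, max_iter=500, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        # Even and odd coefficients of the continued fraction for this m.
        for numer in (
            m * (b - m) * x / ((a + 2 * m - 1.0) * (a + 2 * m)),
            -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1.0)),
        ):
            d = 1.0 + numer * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + numer / c
            if abs(c) < tiny:
                c = tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def _betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b

def f_upper_tail(f0, r1, r0):
    """P(F >= f0) for an F variate with (r1, r0) degrees of freedom."""
    if f0 <= 0.0:
        return 1.0
    return _betainc(0.5 * r0, 0.5 * r1, r0 / (r0 + r1 * f0))

def unit_variance_f_test(s00_1, r1, s00_0, r0):
    """Return the ratio of the two unit-variance estimates and its tail probability."""
    f0 = s00_1 / s00_0
    return f0, f_upper_tail(f0, r1, r0)

# Invented example: S = 21.6 on r1 = 12 degrees of freedom from the
# adjustment, against an independent estimate 1.0 based on r0 = 30.
f0, p = unit_variance_f_test(21.6 / 12, 12, 1.0, 30)
```

A small value of `p` would indicate that the estimate from the adjustment is larger than can reasonably be attributed to chance; the level at which it is judged significant is left to the user.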


When it is not possible to obtain an independent estimate of $\sigma_{00}$, the
above tests cannot be applied. In such cases the adjustment is employed to obtain
the estimate given by (9.2). Indeed, this may even be the primary purpose
for  the  adjustment.  It  must  be  kept  in  mind,  however,  that  for  such  an  estimate 
to  be  valid  it  must  be  known  that  such  factors  as  have  been  mentioned  above  do 
not  influence  the  result  significantly. 

Mistakes can often be localized, and the nature of systematic errors and model
deformities revealed, through a study of the individual residuals and parameters.

Although the $\chi^2$ and $F$ tests described are applicable to the great majority of
problems involving physical measurements, more general methods are required when
the conditions underlying their application are not fulfilled. In this regard we
shall merely mention that such methods are provided by tests based upon the Wishart
and related distributions. A derivation of the Wishart distribution together
with a study of its properties and applications is given by Wilks (1943) [19].


10.  GENERAL  REMARKS

Consideration  of  the  general  least  squares  adjustment  and  related  problems 
in terms of matrix algebra provides a broad, uncluttered concept of the procedures
and operations necessary in the reduction. Since all problems in least
squares are merely special cases of the general problem, there is no need for
the classification of problems into distinct categories. It is probably this
compartmenting of least squares which so long delayed the solution of the general
curve fitting problem and has otherwise retarded the application and interpretation
of the method.

In  some  problems  the  distinction  between  observations  and  parameters  is  not 
clear cut. A parameter may have a physical interpretation and be capable of
direct measurement. In such cases an approximate value of the parameter is
sometimes obtained from a direct measurement. But if such a quantity is actually
measured and has a probability distribution, it should, strictly speaking, be
treated  as  an  observation.  In  practice,  however,  a measured  quantity  is  often 
regarded  as  a parameter,  rather  than  an  observation,  when  it  has  a very  large 
variance  compared  with  that  which  would  result  from  calculating  the  quantity 
from  the  other  observations.  It  might  thus  be  conjectured  that  in  a least 
squares  adjustment  a parameter  may  be  regarded  as  an  observation  with  infinitely 
large  variance.  It  turns  out  that  this  consideration  leads  to  the  same  results 
as the original development, providing it be postulated that an observation of
infinite variance contributes nothing to the degrees of freedom of a set of
observations. (Since variates must have finite variances in order to have the
multinormal distribution, this discussion should be considered only in a heuristic
sense.) This concept allows the formulation of approximation procedures in
which  observations  with  relatively  large  variances  are  treated  as  parameters. 
Conversely,  treating  parameters  as  observations  with  relatively  large  variances 
can lead to useful results. Levenberg (1944) [17], for example, used this concept
implicitly in deriving the method of 'damped least squares,' which is useful
when the usual least squares solution fails to converge due to poor approximation
parameters. In essence, Levenberg showed that by treating the parameters
as observations with appropriate variances (a method for calculating
optimum variances was given) the solution can be made to converge to the correct
result. Normally, with each iteration the variances of the parameter residuals
increase until ultimately they no longer influence the solution. Although
Levenberg derived damped least squares only for the special case in which a single,
independent observation appears in each condition equation ($\sigma$ diagonal; $F$
square, diagonal), the method can readily be extended to hold for the general
case. Incidentally, this provides an illustration of the fact that many results
which have been proven for a specific area of least squares actually hold (perhaps
with slight modification) for the general case as well.
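
The idea of treating a parameter correction as an observation of finite variance can be sketched for a one-parameter problem. The sketch below is not Levenberg's procedure for optimum variances: it assumes a hypothetical model y = exp(-k t) with unit-weight observations, and it uses a simple accept-or-reject rule (an assumption made here for illustration) under which the variance `param_var` assigned to the correction is increased after each successful iteration and decreased after a failure. All names and data are invented.

```python
import math

def damped_fit(t, y, k0, param_var=1.0, max_iter=200, tol=1e-12):
    """Fit y ~ exp(-k t), treating each correction dk as a pseudo-observation
    'dk = 0' with variance param_var (i.e. with weight 1 / param_var)."""
    def cost(k):
        return sum((yi - math.exp(-k * ti)) ** 2 for ti, yi in zip(t, y))

    k = k0
    current = cost(k)
    for _ in range(max_iter):
        # Residuals of the observations and derivative of the model w.r.t. k.
        r = [yi - math.exp(-k * ti) for ti, yi in zip(t, y)]
        J = [-ti * math.exp(-k * ti) for ti in t]
        # Normal equation for dk; the term 1 / param_var is the damping
        # contributed by the pseudo-observation of the parameter.
        dk = sum(j * ri for j, ri in zip(J, r)) / (
            sum(j * j for j in J) + 1.0 / param_var)
        if abs(dk) < tol:
            break
        trial = cost(k + dk)
        if trial < current:
            # Successful step: accept it and let the variance grow, so the
            # pseudo-observation influences later iterations less and less.
            k, current = k + dk, trial
            param_var *= 10.0
        else:
            # Step overshot: tighten the pseudo-observation and try again.
            param_var /= 10.0
    return k

# Invented, error-free data generated with k = 0.5, started from a poor
# approximation k0 = 3.
t_obs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y_obs = [math.exp(-0.5 * ti) for ti in t_obs]
k_hat = damped_fit(t_obs, y_obs, k0=3.0)
```

As the variance grows the damping term vanishes and the iteration reduces to the ordinary least squares correction, which matches the behavior described in the text.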

The least squares adjustment is capable of the following geometrical interpretation.
Consider an n-dimensional coordinate system with orthogonal axes
$v_1, v_2, \ldots, v_n$. Then the quadratic form $S = V^T \sigma^{-1} V$, being positive definite,
will represent an n-dimensional ellipsoid (a detailed study of the n-dimensional
ellipsoid is given by Wilks (1943) [19]). The ellipsoid is centered at the origin.
If $\sigma$ is diagonal, the axes of the ellipsoid will coincide with the coordinate
axes, while for $\sigma$ nondiagonal the ellipsoid will be in a tilted orientation.
It is clear that by a rotational coordinate transformation, a tilted
ellipsoid can be reoriented into a standard position. Such a transformation is
specified by $V' = RV$ where $R$ is an $n \times n$ matrix whose rows (or columns) are
composed of the normalized characteristic vectors of $\sigma$. Thus a problem involving
correlated  observations  can  be  reduced  to  one  involving  derived  observations 
which are uncorrelated. The dimensions of the hyperellipsoid are, of course,
unaffected by a rotation. In fact, the lengths of the axes are directly proportional
to the square roots of the characteristic roots of $\sigma$. The constant of
proportionality which determines their absolute dimensions is simply $S^{1/2}$. It
thus follows that the volume of the ellipsoid is directly proportional to $S^{n/2}$
(the complete expression for the volume is given by Burington and May (1953) [20]).
Therefore, minimizing $S$ is equivalent to minimizing the volume of the ellipsoid,
it being understood, naturally, that the condition equation constraints must be
satisfied by some point on the ellipsoid. To simplify matters we may assume
that any parameters have been eliminated from the linearized condition equations,
leaving  r relations  between  the  residuals  alone.  Each  condition  equation  then 
represents  a hyperplane,  and  the  residuals  must  lie  on  the  intersection  of  the  r 
hyperplanes. Now consider the family of hyperellipsoids defined by varying $S$.
The  orientation  and  relative  dimensions  of  such  ellipsoids  will  be  constant,  and 
all  will  be  centered  at  the  origin.  We  may  think  of  the  family  as  being  formed 
by  the  balloonlike  expansion  of  an  initial  infinitesimal  ellipsoid.  Let  the 
ellipsoid  expand  until  it  becomes  tangent  to  the  intersection  of  the  condition 
equation  hyperplanes.  For  this  point  all  the  condition  equations  are  satisfied 
and the volume of the ellipsoid (and consequently $S$) is obviously a minimum.
Hence  the  coordinates  of  the  point  of  tangency  give  the  most  probable  residuals. 
Thus, from a geometrical point of view the only difference between the adjustment
of correlated and uncorrelated observations lies in the orientation of the
hyperellipsoid  relative  to  the  coordinate  axes.  To  go  somewhat  further  with  the 
interpretation  we  may  suppose  that  the  ellipsoid  has  been  rotated  into  a standard 
position. Then the ellipsoid can be transformed to a hypersphere by a simple
stretch  transformation.  The  most  probable  set  of  residuals  is  then  given  by  the 
coordinates  of  the  point  on  the  intersection  of  condition  equation  hyperplanes 
which is closest to the origin. The square of the distance of this point from the
origin represents the sum of the squares of the residuals. This interpretation is
especially convenient for problems in conventional least squares, since the ellipsoids
are in standard position to begin with and the necessary stretch is readily
accomplished by scaling the observations.
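
The rotation into standard position can be checked numerically for two correlated observations. The sketch below is an illustration added here, not part of the original report. It assumes a 2 x 2 covariance matrix, for which the normalized characteristic vectors follow from a single rotation angle, and the names `rotation_to_standard`, `quad_form`, `sigma`, and `v` are hypothetical, with invented numbers.

```python
import math

def rotation_to_standard(sigma):
    """Rows of R are the normalized characteristic vectors of a symmetric
    2 x 2 matrix sigma, so that R sigma R^T is diagonal."""
    a, b, c = sigma[0][0], sigma[0][1], sigma[1][1]
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    cs, sn = math.cos(theta), math.sin(theta)
    return [[cs, sn], [-sn, cs]]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def quad_form(sigma, v):
    """S = v^T sigma^{-1} v for a symmetric 2 x 2 matrix sigma."""
    a, b, c = sigma[0][0], sigma[0][1], sigma[1][1]
    det = a * c - b * b
    return (c * v[0] ** 2 - 2.0 * b * v[0] * v[1] + a * v[1] ** 2) / det

# Invented covariance matrix of two correlated observations and a set of
# residuals V.
sigma = [[4.0, 1.5],
         [1.5, 2.0]]
v = [0.8, -0.5]

R = rotation_to_standard(sigma)

# Derived residuals V' = R V and covariance R sigma R^T of the derived
# (uncorrelated) observations.
v_rot = [R[0][0] * v[0] + R[0][1] * v[1],
         R[1][0] * v[0] + R[1][1] * v[1]]
sigma_rot = matmul(matmul(R, sigma), transpose(R))

# The quadratic form is unchanged by the rotation.
S = quad_form(sigma, v)
S_rot = quad_form(sigma_rot, v_rot)

# Semi-axes of the ellipse through V: square roots of the characteristic
# roots of sigma, scaled by S ** 0.5.
semi_axes = [math.sqrt(S * sigma_rot[i][i]) for i in range(2)]
```

Dividing each derived residual by the square root of the corresponding characteristic root would then supply the stretch that turns the ellipse into a circle.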


Duane  C.  Brown 
SUPPLEMENTARY  REMARKS


After  the  present  report  had  been  readied  for  publication,  the  author 
encountered  the  two  additional  matrix  treatments  of  least  squares  which  follow: 

[21] C. R. Rao, "ADVANCED STATISTICAL METHODS IN BIOMETRIC RESEARCH," Wiley
(1952), ch. 2, 3.

[22] O. Kempthorne, "THE DESIGN AND ANALYSIS OF EXPERIMENTS," Wiley (1952),
pp. 54-66.

Both of these references consider the special case for which $F$ is equal to
the unit matrix and Q is merely a multiple of the unit matrix. Rao extends his
results to include the case for which some of the condition equations involve
parameters only. However, his F‘ submatrix for this more general case is
again the unit matrix.

G.  H.  Weiss  of  BRL  has  pointed  out  that  the  principal  result  of  Section  7 
is  to  be  found  in  the  reference 

[23] H. Cramér, "MATHEMATICAL METHODS OF STATISTICS," Princeton Univ. Press
(1946), pp. 312-313.

Cramér obtains the result quite simply by showing that the characteristic function
of the transformation is that of a multinormal distribution with covariance
matrix of the form (7.16).

It has been suggested that a reference on inversion by the method of submatrices
would be appropriate in connection with the derivation of equation
(4.7). This is provided by

[24] Frazer, Duncan, Collar, "ELEMENTARY MATRICES," Cambridge Univ. Press
(1950), pp. 112-113.

Also in regard to (4.7) it seems worthwhile to mention that the derivation is
considerably simplified if the relations (4.18) - (4.20) are employed. This is
also true for the reductions leading to equations (5.22), (5.25) and (5.33).